Bayesian Model Averaging for Improving Performance of the Naïve Bayes Classifier
Abstract
Feature selection has proved to be an effective way to reduce model complexity while maintaining relatively good accuracy, especially when data are scarce or the acquisition of some features is expensive. However, a single selected model may not always generalize well to unseen test data, whereas other models may perform better. Bayesian Model Averaging (BMA) is a widely used approach to address this problem, and there are several methods for implementing it. In this project, a formula for applying BMA to the Naive Bayes classifier is derived and the corresponding algorithm is developed. We expect this modified classifier to always perform at least as well as the Naive Bayes classifier and to be competitive with other linear classifiers such as SVM and Logistic Regression, so a series of performance comparisons across several classifiers is made. To provide a comprehensive evaluation, we build all of the test classifiers under the same framework API, train them on UCI data sets, and apply cross-validation to assess the statistical significance of the results. The results indicate three properties of our modified classifier. First, its prediction accuracy is either better than or at least equal to that of Naive Bayes and, in some cases, even better than SVM and Logistic Regression. Second, it runs faster than SVM and Logistic Regression. Third, its time complexity is linear in both the number of data points and the number of features.
Keywords— Bayesian Model Averaging, Naive Bayes, Feature Selection.
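The abstract states that a formula for applying BMA to the Naive Bayes classifier is derived, but does not reproduce it. As a rough illustration of the general idea only, the following minimal Python sketch averages Naive Bayes models built on different feature subsets, weighting each candidate model M_k by an approximate posterior p(M_k | D), so that p(y | x, D) is approximated by sum_k p(y | x, M_k) p(M_k | D). The use of scikit-learn's GaussianNB, the BIC-style weighting, and all function names are hypothetical choices for illustration, not the authors' derivation or implementation.

import numpy as np
from itertools import combinations
from sklearn.naive_bayes import GaussianNB

def approx_log_evidence(model, X, y):
    """BIC-style approximation of log p(D | M_k) for one Naive Bayes model.
    Uses the conditional log-likelihood of the labels; this is an assumption,
    not necessarily the evidence term derived in the paper."""
    n, d = X.shape
    idx = np.searchsorted(model.classes_, y)          # map labels to column order
    log_lik = model.predict_log_proba(X)[np.arange(n), idx].sum()
    # parameter count: mean and variance per feature per class, plus class priors
    k = 2 * d * len(model.classes_) + len(model.classes_) - 1
    return log_lik - 0.5 * k * np.log(n)

def fit_bma_naive_bayes(X, y, max_subset_size=2):
    """Fit one Gaussian NB per small feature subset and weight it by an
    approximate posterior p(M_k | D) (softmax over the scores above).
    Enumerating subsets is only for illustration; the paper's algorithm is
    claimed to be linear in the number of features, so it presumably avoids
    this exhaustive enumeration."""
    d = X.shape[1]
    models, scores = [], []
    for size in range(1, max_subset_size + 1):
        for subset in combinations(range(d), size):
            cols = list(subset)
            nb = GaussianNB().fit(X[:, cols], y)
            models.append((cols, nb))
            scores.append(approx_log_evidence(nb, X[:, cols], y))
    scores = np.array(scores)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return models, weights

def bma_predict_proba(models, weights, X):
    """p(y | x, D) is approximated by sum_k p(y | x, M_k) * p(M_k | D)."""
    proba = None
    for (cols, nb), w in zip(models, weights):
        contribution = w * nb.predict_proba(X[:, cols])
        proba = contribution if proba is None else proba + contribution
    return proba

# Usage sketch (X_train, y_train, X_test are hypothetical numpy arrays,
# e.g. a UCI data set with integer class labels):
# models, weights = fit_bma_naive_bayes(X_train, y_train)
# class_index = bma_predict_proba(models, weights, X_test).argmax(axis=1)
# y_pred = models[0][1].classes_[class_index]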
Similar articles
Decision Tree Induction: 17.1 Introduction; 17.2 Attribute selection measure; 17.3 Tree Pruning; 17.4 Extracting Classification Rules from Decision Trees; 17.5 Bayesian Classification; 17.6 Bayes Theorem; 17.7 Naïve Bayesian Classification; 17.8 Bayesian Belief Networks
Effective Discretization and Hybrid feature selection using Naïve Bayesian classifier for Medical datamining
As a probability-based statistical classification method, the Naïve Bayesian classifier has gained wide popularity despite its assumption that attributes are conditionally mutually independent given the class label. Improving the predictive accuracy and achieving dimensionality reduction for statistical classifiers have been active research areas in data mining. Our experimental results suggest...
Augmented Naïve Bayesian Model of Classification Learning
The Naïve Bayesian Classifier and an Augmented Naïve Bayesian Classifier are applied to human classification tasks. The Naïve Bayesian Classifier is augmented with feature construction using a Galois lattice. The best features, measured on their within- and between-category overlap, are added to the category's concept description. The results show that space efficient concept descriptions can pre...
Compression-Based Averaging of Selective Naive Bayes Classifiers
The naive Bayes classifier has proved to be very effective on many real data applications. Its performance usually benefits from an accurate estimation of univariate conditional probabilities and from variable selection. However, although variable selection is a desirable feature, it is prone to overfitting. In this paper, we introduce a Bayesian regularization technique to select the most prob...
Tractable Bayesian Learning of Tree Augmented Naive Bayes Models
Bayesian classifiers such as Naive Bayes or Tree Augmented Naive Bayes (TAN) have shown excellent performance given their simplicity and heavy underlying independence assumptions. In this paper we introduce a classifier taking as basis the TAN model and taking into account uncertainty in model selection. To do this we introduce decomposable distributions over TANs and show that they allow the e...